11 research outputs found

Fake Run-Time Selection of Template Arguments in C++

Author: Draayer Jerry P.
Dytrych Tomáš
Langr Daniel
Tvrdík Pavel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/06/2012

C++ does not support run-time resolution of template type arguments. To circumvent this restriction, we can instantiate a template for all possible combinations of type arguments at compile time and then select the proper instance at run time by evaluating some provided conditions. However, for templates with multiple type parameters, such a solution may easily result in branching code bloat. We present a template metaprogramming algorithm called for_id that allows the user to select the proper template instance at run time with theoretical minimum sustained complexity of the branching code.
Comment: Objects, Models, Components, Patterns (50th International Conference, TOOLS 2012
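The baseline idea in this abstract (instantiate every combination at compile time, then pick one at run time) can be sketched as follows. This is a minimal illustration and not the paper's for_id algorithm: the names `type_tag`, `dispatch_type`, `record_size` and the id-to-type mapping (0, 1, 2) are hypothetical, chosen only for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper: carries a type as a value so it can be passed to a lambda.
template <typename T>
struct type_tag { using type = T; };

// Maps a run-time id to one of three candidate types (an assumed id scheme)
// and calls f with the matching tag. Every branch must return the same type.
template <typename F>
auto dispatch_type(int id, F&& f) {
    switch (id) {
        case 0: return f(type_tag<std::int32_t>{});
        case 1: return f(type_tag<float>{});
        case 2: return f(type_tag<double>{});
        default: throw std::invalid_argument("unknown type id");
    }
}

// Stand-in for the template whose arguments are only known at run time.
template <typename T, typename U>
std::size_t record_size(type_tag<T>, type_tag<U>) {
    return sizeof(T) + sizeof(U);
}

// Selects one of the 3 x 3 instances of record_size from two run-time ids.
// The source contains a single switch, reused once per template parameter.
std::size_t record_size_runtime(int id_t, int id_u) {
    return dispatch_type(id_t, [&](auto t) {
        return dispatch_type(id_u, [&](auto u) {
            return record_size(t, u);
        });
    });
}
```

Calling `record_size_runtime(0, 2)` selects the `<std::int32_t, double>` instance. The compiler still generates all nine instances; nesting the dispatcher only keeps the hand-written branching code from growing with the number of combinations.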

arXiv.org e-Print Archive

Louisiana State University

Parallel Solver of Large Systems of Linear Inequalities Using Fourier-Motzkin Elimination

Author: Fritsch Richard
Langr Daniel
Lórencz Róbert
Šimeček Ivan
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 10/02/2017

Fourier-Motzkin elimination is a computationally expensive but powerful method for solving a system of linear inequalities. Such systems arise, e.g., in execution order analysis for loop nests or in integer linear programming. This paper focuses on the analysis, design and implementation of a parallel distributed-memory solver for large systems of linear inequalities using the Fourier-Motzkin elimination algorithm. We also measure the speedup of the parallel solver and prove that this implementation results in good scalability.
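For readers unfamiliar with the method, a single sequential elimination step can be sketched as below. This is a textbook illustration, not the paper's parallel solver; the `Inequality` struct and the `eliminate` function are hypothetical names. Each inequality with a positive coefficient on the eliminated variable is combined with each one that has a negative coefficient, which is why the system can grow quadratically per step.

```cpp
#include <cstddef>
#include <vector>

// One inequality: sum_j coeff[j] * x[j] <= rhs.
struct Inequality {
    std::vector<double> coeff;
    double rhs;
};

// Eliminates variable k from the system and returns the projected system.
std::vector<Inequality> eliminate(const std::vector<Inequality>& sys, std::size_t k) {
    std::vector<Inequality> pos, neg, out;
    for (const auto& q : sys) {
        if (q.coeff[k] > 0) pos.push_back(q);       // upper bounds on x[k]
        else if (q.coeff[k] < 0) neg.push_back(q);  // lower bounds on x[k]
        else out.push_back(q);                      // x[k] does not occur
    }
    for (const auto& p : pos) {
        for (const auto& n : neg) {
            const double a = p.coeff[k];   // a > 0
            const double c = -n.coeff[k];  // c > 0
            Inequality r{std::vector<double>(p.coeff.size()), 0.0};
            // c * p + a * n cancels the x[k] terms: c*a - a*c = 0.
            for (std::size_t j = 0; j < p.coeff.size(); ++j)
                r.coeff[j] = c * p.coeff[j] + a * n.coeff[j];
            r.coeff[k] = 0.0;
            r.rhs = c * p.rhs + a * n.rhs;
            out.push_back(r);
        }
    }
    return out;
}
```

For the system x + y <= 4, -x + y <= 0, -y <= -1, eliminating x keeps -y <= -1 and adds 2y <= 4, so the projection onto y is 1 <= y <= 2.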

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Algorithm 947: Paraperm-parallel generation of random permutations with MPI

Author: Draayer Jerry P.
Dytrych Tomáš
Langr Daniel
Tvrdík Pavel
Publication venue: LSU Digital Commons
Publication date: 01/01/2014

An algorithm for parallel generation of a random permutation of a large set of distinct integers is presented. This algorithm is designed for massively parallel systems with distributed memory architectures and MPI-based runtime environments. Scalability of the algorithm is analyzed according to the memory and communication requirements. An implementation of the algorithm in the form of a software library based on the C++ programming language and the MPI application programming interface is further provided. Finally, the performed experiments are described and their results discussed. The biggest of these experiments resulted in the generation of a random permutation of 2^41 integers in slightly more than four minutes using 131072 CPU cores.
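As a point of reference, the sequential version of this task is the classic Fisher-Yates shuffle, sketched below. It is not the Paraperm algorithm: it assumes the whole permutation fits in the memory of one process, which is the limitation a distributed-memory approach addresses. The function name `random_permutation` and the seed parameter are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Returns a pseudo-random permutation of 0, 1, ..., n-1 (sequential Fisher-Yates).
std::vector<std::uint64_t> random_permutation(std::size_t n, std::uint64_t seed) {
    std::vector<std::uint64_t> p(n);
    std::iota(p.begin(), p.end(), std::uint64_t{0});  // identity permutation
    std::mt19937_64 gen(seed);
    // Walk from the back; swap each position with a uniformly chosen
    // position at or before it.
    for (std::size_t i = n; i > 1; --i) {
        std::uniform_int_distribution<std::size_t> pick(0, i - 1);
        std::swap(p[i - 1], p[pick(gen)]);
    }
    return p;
}
```

At 8 bytes per element, a permutation of 2^41 integers occupies 16 TiB, so a single-node loop like this one is not an option at that scale.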

Louisiana State University

Accelerating many-nucleon basis generation for high performance computing enabled ab initio nuclear structure studies

Author: Draayer Jerry P.
Dytrych Tomáš
Langr Daniel
Launey Kristina D.
Publication venue: LSU Digital Commons
Publication date: 01/05/2019

We present the problem of generating a many-nucleon basis in the SU(3)-scheme for ab initio nuclear structure calculations in a symmetry-adapted no-core shell model framework. We first discuss and analyze the basis construction algorithm, whose baseline implementation quickly becomes a significant bottleneck for large model spaces and heavier nuclei. The outcomes of this analysis are used to propose a new scalable version of the algorithm. Its performance is then studied empirically on the Blue Waters supercomputer. The measurements show significant acceleration, with speedups of over two orders of magnitude for larger model spaces.

Louisiana State University

Block Iterators for Sparse Matrices

Author: Daniel Langr
Ivan Šimeček
Tomáš Dytrych
Publication venue: 'Polish Information Processing Society PTI'
Publication date: 01/10/2016

Crossref

Directory of Open Access Journals

Transformation of a nucleon-nucleon potential operator into its SU(3) tensor form using GPUs

Author: Draayer Jerry P.
Dytrych Tomáš
Langr Daniel
Launey Kristina D.
Oberhuber Tomáš
Publication venue: LSU Digital Commons
Publication date: 01/03/2021

Starting from the matrix elements of a nucleon-nucleon potential operator provided in a basis of spherical harmonic oscillator functions, we present an algorithm for expressing a given potential operator in terms of irreducible tensors of the SU(3) and SU(2) groups. Further, we introduce a GPU-based implementation of this algorithm and compare its performance with that of a CPU-based version. We find that the CUDA implementation delivers speedups of 2.27x to 5.93x.

Louisiana State University

Efficient parallel evaluation of block properties of sparse matrices

Author: Daniel Langr
Ivan Šimeček
Publication venue: 'Polish Information Processing Society PTI'
Publication date: 01/10/2016

Crossref

Directory of Open Access Journals

SU3lib: A C++ library for accurate computation of Wigner and Racah coefficients of SU(3)

Author: Draayer Jerry P.
Dytrych Tomáš
Gazda Daniel
Langr Daniel
Launey Kristina D.
Publication venue: LSU Digital Commons
Publication date: 01/12/2021

We present the C++ library SU3lib for accurate computation of SU(3) Wigner coupling and Racah recoupling coefficients. It is built on the efficient mathematical algorithm originally proposed by Draayer and Akiyama [1]. The presented library extends the reach of this algorithm towards large SU(3) irreducible representations and outer multiplicities that were heretofore inaccessible due to floating-point precision errors. As large irreducible representations of SU(3) play an important role in medium- and heavy-mass atomic nuclei, SU3lib expands the scope of approaches to nuclear structure and reactions that rely on available SU(3) coupling-recoupling coefficients.
Program summary:
Program Title: SU3lib
CPC Library link to program files: https://doi.org/10.17632/j977v8v5fp.1
Developer's repository link: https://gitlab.com/tdytrych/SU3lib
Licensing provisions: BSD 2-clause
Programming language: C++
External libraries: WIGXJPF [3], Boost
Nature of problem: Accurate calculation of SU(3)⊃SO(3) and SU(3)⊃SU(2)×U(1) Wigner coupling and Racah recoupling coefficients for arbitrary couplings and multiplicity.
Solution method: We adopt the mathematical procedure proposed by Draayer and Akiyama [1], who also provided its implementation as a FORTRAN library [2]. The challenge is to avoid the loss of precision due to cancellation in sums of large alternating terms in transformation between SU(3)⊃SO(3) and SU(3)⊃SU(2)×U(1) schemes, and to compute SU(3)⊃SU(2)×U(1) Wigner coefficients accurately for large outer multiplicities. The present library tackles these challenges by implementing key formulas and data structures as C++ templates and utilizing floating-point data types with extended precision provided by the Boost.Multiprecision library as template arguments. This permits an efficient and accurate computation of SU(3) coefficients even for large SU(3) irreps and outer multiplicities that were heretofore inaccessible.
References:
[1] J. P. Draayer and Y. Akiyama, J. Math. Phys. 14, 1904 (1973).
[2] Y. Akiyama and J. P. Draayer, Comp. Phys. Comm. 5, 405 (1973).
[3] H. T. Johansson and C. Forssén, SIAM J. Sci. Comput. 38(1), A376 (2016).

Louisiana State University

Efficient algorithm for representations of U(3) in U(N)

Author: Draayer Jerry P.
Dytrych Tomáš
Langr Daniel
Launey Kristina D.
Tvrdík Pavel
Publication venue: LSU Digital Commons
Publication date: 01/11/2019

An efficient algorithm for enumerating representations of U(3) that occur in a representation of the unitary group U(N) is introduced. The algorithm is applicable to U(N) representations associated with a system of identical fermions (protons, neutrons, electrons, etc.) distributed among the N=(η+1)(η+2)∕2 degenerate eigenstates of the ηth level of the three-dimensional harmonic oscillator. A C++ implementation of the algorithm is provided and its performance is evaluated. The implementation can employ OpenMP threading for use in parallel applications.
Program summary:
Program Title: UNtoU3.h
Program files doi: http://dx.doi.org/10.17632/3g4w8f9vdk.1
Licensing provisions: MIT
Programming language: C++
Nature of problem: The determination of the complete set of U(3) irreducible representations (irreps) that occurs in a representation of U(N), where N=(η+1)(η+2)∕2 is the degeneracy of the ηth harmonic oscillator shell.
Solution method: The resulting set of U(3) irreps is determined by applying a simple difference relation to the U(3) weight distribution of the Gelfand basis states spanning a given U(N) irrep.
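The degeneracy formula N=(η+1)(η+2)∕2 quoted in this abstract can be checked by counting the ways to distribute η oscillator quanta among the three Cartesian directions. The sketch below does only that; it is not part of UNtoU3.h, and the function names are hypothetical.

```cpp
#include <cstdint>

// Closed form quoted in the abstract: N = (eta + 1)(eta + 2) / 2.
constexpr std::uint64_t shell_degeneracy(std::uint64_t eta) {
    return (eta + 1) * (eta + 2) / 2;
}

// Brute-force count of triples (nx, ny, nz) of non-negative integers
// with nx + ny + nz == eta; nz is fixed once nx and ny are chosen.
std::uint64_t count_oscillator_states(std::uint64_t eta) {
    std::uint64_t count = 0;
    for (std::uint64_t nx = 0; nx <= eta; ++nx)
        for (std::uint64_t ny = 0; ny <= eta - nx; ++ny)
            ++count;
    return count;
}
```

For η = 0, 1, 2, 3 both functions give 1, 3, 6, 10, so the corresponding unitary groups are U(1), U(3), U(6) and U(10).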

Louisiana State University